Using the 1–0 coding for the chord templates, I generated a keygram for Reinout’s second track. This keygram makes use of the new helper function compmus_match_pitch_templates, which compares the averaged chroma vectors against templates. Generally, a keygram shows the progression of chords over time by matching chroma features (pitch-class profiles) to predefined chord templates. The visualization represents which chords are most likely at each point in time: for the chosen track, dark colors, such as at the start of the track and around the 70–80 second range, indicate small distances between the recorded chroma and a template, i.e. a close match. Keygrams help identify harmonic progressions, modulations, and changes in harmony over time.
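The matching step itself is simple: for each frame (or averaged window), compute the distance between the 12-bin chroma vector and each 1–0 template, and the smallest distance wins. The portfolio uses the compmus package in R for this; below is a minimal language-agnostic sketch in Python, with hypothetical templates and chroma values chosen purely for illustration.

```python
import math

# Hypothetical 1-0 ("binary") templates over the 12 pitch classes
# (C, C#, D, ..., B); 1 marks a chord tone, 0 a non-chord tone.
TEMPLATES = {
    "C:maj": [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],   # C, E, G
    "A:min": [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],   # A, C, E
    "G:maj": [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1],   # G, B, D
}

def match_templates(chroma, templates=TEMPLATES):
    """Return (best_label, distances) for a 12-bin chroma vector, using
    Euclidean distance: smaller distance = darker cell in the keygram."""
    distances = {
        label: math.sqrt(sum((c - t) ** 2 for c, t in zip(chroma, template)))
        for label, template in templates.items()
    }
    return min(distances, key=distances.get), distances

# A chroma frame with most energy on C, E and G should match C:maj best.
chroma = [0.9, 0.0, 0.1, 0.0, 0.8, 0.1, 0.0, 0.7, 0.0, 0.2, 0.0, 0.1]
best, _ = match_templates(chroma)
```

Other distance measures (Manhattan, cosine, and so on) slot into the same structure; only the per-template formula changes.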
This is the same keygram generated using Temperley’s proposed improvements. It reveals broadly similar insights, but Temperley’s improvements generally imply clearer, more stable key regions: they assign higher weights to stable scale degrees (such as the tonic and dominant) and thus reflect more natural tonal hierarchies.
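The difference between the two keygrams comes down to replacing flat 1–0 templates with graded key profiles. The sketch below uses made-up illustrative weights (not Temperley’s exact published values) simply to show the mechanism: a profile that boosts the tonic and dominant rewards tonally stable frames more than a flat template does.

```python
import math

# Flat 1-0 template for the C-major scale tones.
BINARY_C_MAJOR = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1]

# Illustrative weighted profile (NOT Temperley's published numbers):
# tonic C and dominant G are weighted up, chromatic notes near zero.
WEIGHTED_C_MAJOR = [5.0, 0.5, 2.0, 0.5, 3.0, 2.5, 0.5, 4.0,
                    0.5, 2.0, 0.5, 2.5]

def cosine_similarity(u, v):
    """Cosine similarity between two 12-bin vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# A tonally "stable" frame, dominated by the tonic triad C-E-G:
stable = [1.0, 0.0, 0.2, 0.0, 0.8, 0.2, 0.0, 0.9, 0.0, 0.2, 0.0, 0.1]
# The weighted profile scores this frame higher than the flat template,
# which is why the weighted keygram shows clearer, more stable key regions.
```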
Timbre features, often represented by MFCCs (Mel-Frequency Cepstral Coefficients), capture the spectral characteristics of the sound. This timbre-based self-similarity chart highlights instrumental changes and overall shifts in sound quality and production.

The effectiveness of chroma- or timbre-based self-similarity for structural analysis depends on the specific characteristics of the track. Chroma-based self-similarity captures harmonic progressions, tonal structure, key changes, and chord progressions, providing a clearer structural picture for tracks with rich harmonic content such as pop, jazz, and classical music. Timbre-based self-similarity, by contrast, captures instrumental texture and sound quality, outlining changes in orchestration, dynamics, and articulation. Because my chosen track is a more electronic / EDM song, its timbre features are at the forefront, making the timbre-based self-similarity chart more insightful. The chart portrays the track’s repeated instrumental sections through prominent diagonal lines, with sudden shifts indicating timbral changes (e.g. the introduction of new instruments).
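Whichever feature is used, a self-similarity chart is just the matrix of pairwise distances between per-frame feature vectors. A minimal sketch, with toy two-dimensional frames standing in for real MFCC or chroma vectors:

```python
import math

def self_similarity(frames):
    """Pairwise Euclidean distances between feature frames (e.g. per-frame
    MFCC or chroma vectors). Low values between repeated sections produce
    the dark diagonal stripes seen in a self-similarity chart."""
    n = len(frames)
    return [[math.dist(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]

# Toy "track": frames 0-1 and 3-4 share the same (hypothetical) timbre,
# while frame 2 is a contrasting section.
frames = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [1.0, 0.0], [1.0, 0.1]]
ssm = self_similarity(frames)
# ssm[0][3] is zero (frame 3 repeats frame 0), ssm[0][2] is large.
```

The main diagonal is always zero (every frame matches itself); off-diagonal stripes of low distance mark repeated material.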
Welcome to my Computational Musicology portfolio for 2025! This storyboard contains my perspective on the examples from each week.
This is the deliberately bad visualisation of the AI Song Contest data that we used in our first lab session, this time embedded in a dashboard.
To improve and build upon the first visualization, I sought to formulate a story by improving the look of the visualization and making it more readable. I did this by removing the geom_rug() layer, which did not add much value.

This updated version:
✅ Clearly shows tempo vs. arousal trends
✅ Uses color and size effectively
✅ Highlights overall trends with a dashed trend line
✅ Has a clean and readable layout
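The dashed trend line in the updated plot is an ordinary least-squares fit of arousal on tempo. As a language-neutral sketch of what that layer computes (the data points below are hypothetical, chosen only to show an upward trend):

```python
def linear_trend(xs, ys):
    """Ordinary least-squares slope and intercept: the line a dashed
    linear trend layer draws through a scatter plot."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical (tempo, arousal) points with a clear upward trend:
tempo = [80, 100, 120, 140, 160]
arousal = [2.0, 3.1, 4.2, 4.9, 6.0]
slope, intercept = linear_trend(tempo, arousal)
# slope > 0 confirms the tempo-vs-arousal trend the plot highlights.
```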
I decided to explore different genres that I like by asking different models for various genres of songs. I have always been a fan of the post-punk and alternative rock music of the late 90s to early 2000s, such as Interpol, The Strokes, Bloc Party, Arctic Monkeys, and Fontaines D.C. With the help of AI, I came up with this description to use as a prompt for generative AI music models: “a high-energy alt rock / post-punk song with a melodic bassline, intricate drumming, and sharp and rhythmic guitar work, reminiscent of the bands Interpol and Bloc Party. Dynamic, with tension-building verses leading into an explosive, anthemic chorus. Create a sense of depth and intensity.” I also used this shortened version for models with character limits: “Post-Punk, Driving Melodic Bassline, Angular Reverb-Drenched Guitars, Punchy Dynamic Drumming, Moody Detached Vocals, Urgent & Anthemic, Dark Yet Energetic, Tension-Building Composition, 140 BPM”.

I have also recently been enjoying deep house music, so I chose this genre for one of my songs, using the following prompts: “a deep house song that has a hypnotic beat, gradually layering warm synths, deep basslines, and subtle percussive beats, with a steady, entrancing rhythm. Slow, cinematic build-ups that evoke nostalgia and euphoria. Incorporate atmospheric pads and a shimmering, time-dissolving feel, with immersive and emotionally uplifting verses and bridges, suitable for a sunset in the mountains” and “Deep House, Hypnotic Synth Pads, Pulsing Bassline, Rolling Four-on-the-Floor Groove, Atmospheric Textures, Slow-Building Progression, Dreamy Vocal Samples, Cinematic, Nostalgic, Expansive, 120 BPM”.

Finally, I wanted to try another genre that I enjoy, something along the lines of Lana Del Rey’s style, so I used the following prompts: “A cinematic baroque / dream pop composition that blends dreamy electronic synths with classical orchestration. Feature violins, melancholic clarinets, and rich trumpet swells, weaving through ethereal synth pads and delicate, reverberated piano. The rhythm should be slow and hypnotic, with a hazy, dreamlike quality. The vocals should be intimate yet grand, drenched in vintage-style reverb, with poetic, melancholic lyrics evoking themes of romance, nostalgia, and faded Hollywood glamour. Think of Lana Del Rey’s storytelling style, but with a modern dream pop twist—layered harmonies, sweeping crescendos, and an air of cinematic longing.” and “Baroque Pop / Dream Pop, Ethereal Synth Pads, Sweeping Violin & Clarinet Arrangements, Melancholic Trumpet Swells, Reverb-Drenched Intimate Vocals, Vintage Aesthetic, Poetic & Nostalgic, Cinematic & Grand, 80 BPM”.

I explored the outputs of these prompts from various models, including Suno, Stable Audio, Beatoven.ai, Soundverse.ai, Udio, and Mubert. Although I was hesitant to try Suno and Udio given their use of artists’ music without compensation, I wanted to see whether there would be any differences in quality, production output, relevance to the prompt, and similarity to my expectations and to existing songs. I initially settled on the Stable Audio deep house track because it best matched my expectations: emotive and intense, while also calming and not overly elaborate. I found the vocal and lyrical qualities of most models to be somewhat lower than I expected, with many songs sounding unnatural or (understandably) AI-generated. In the end, I went with another deep house song that I generated on Suno.
The Essentia features of my own tracks were not available for analysis, so as the basis for the following visualizations I used a fellow student’s (Reinout W.) tracks, which I personally enjoyed, found intriguing, and was curious what computational analysis might reveal about. This chromagram of Reinout’s second track reveals the distribution of the 12 musical pitch classes in the selected song. It shows relatively high instances of the C, D, E, G, and A pitch classes, with lower instances of the C#/Db, D#/Eb, F, F#/Gb, G#/Ab, and A#/Bb classes. A chromagram represents pitch-class content regardless of octave, making it useful for identifying harmonic structure and key.
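The octave-folding at the heart of a chromagram can be sketched in a few lines: every frequency is mapped to one of the 12 pitch classes, so C4 and C5 land in the same bin. (Real chroma extraction works on spectrogram bins and energy weighting; this is only the binning idea.)

```python
import math

PITCH_CLASSES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
                 "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]

def pitch_class(freq_hz, a4=440.0):
    """Map a frequency to one of the 12 pitch classes, octave-folded --
    the binning step behind a chromagram."""
    # Semitones above A4, rounded to the nearest; +9 shifts the
    # reference from A to C before folding modulo 12.
    semis = round(12 * math.log2(freq_hz / a4))
    return PITCH_CLASSES[(semis + 9) % 12]

# C4 (~261.63 Hz) and C5 (~523.25 Hz) fold into the same "C" bin.
```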
I modified the template code by changing the norm parameter, which affects how the chroma features are normalized: I chose the Manhattan norm to retain musical structure.

This cepstrogram of the same track gives a visual representation of the cepstrum of the signal over time, which is used for timbre analysis. It works by computing a spectrogram (the magnitude of a Fourier transform) and then applying a second, inverse Fourier transform to the logarithm of the estimated signal spectrum; the result is the cepstrum.
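That two-step recipe (magnitude spectrum, then an inverse transform of its logarithm) can be written out directly. The sketch below uses a naive O(n²) DFT purely for transparency; real analysis pipelines use an FFT and mel filtering before taking MFCCs.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * i * k / n) for k in range(n))
            for i in range(n)]

def real_cepstrum(frame, eps=1e-10):
    """Cepstrum of one frame, following the recipe above:
    magnitude spectrum -> logarithm -> inverse Fourier transform."""
    log_mag = [math.log(abs(c) + eps) for c in dft(frame)]
    n = len(frame)
    # Inverse DFT of the log-magnitude spectrum; for a real input the
    # log spectrum is symmetric, so the result is (numerically) real.
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * i * k / n)
                for k in range(n)).real / n for i in range(n)]

# A short sinusoidal test frame; the cepstrum has the input's length.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
ceps = real_cepstrum(frame)
```

A cepstrogram stacks one such cepstrum per analysis frame along the time axis.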
I modified the template code by changing the norm parameter, which affects how the chroma features are normalized: here I chose the Euclidean norm, which is suited to high-dimensional data and emphasizes clarity.
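The two norm settings used above differ only in the scaling factor applied to each feature vector; a minimal sketch of both:

```python
def normalize(chroma, norm="euclidean"):
    """Scale a feature vector by its Manhattan (L1) or Euclidean (L2)
    norm -- the two settings used in the visualizations above."""
    if norm == "manhattan":
        scale = sum(abs(v) for v in chroma)      # L1: values sum to 1
    elif norm == "euclidean":
        scale = sum(v * v for v in chroma) ** 0.5  # L2: unit length
    else:
        raise ValueError(f"unknown norm: {norm}")
    return [v / scale for v in chroma]

# With v = [3, 4]: L1 gives [3/7, 4/7]; L2 gives [0.6, 0.8].
v = [3.0, 4.0]
l1 = normalize(v, "manhattan")
l2 = normalize(v, "euclidean")
```

Manhattan normalization preserves the relative share of each pitch class (useful for keeping musical structure legible), while Euclidean normalization puts every frame on the unit sphere, which plays well with cosine- and distance-based comparisons.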